In this notebook, we will demonstrate the application of Evidential Learning on an analog to Wintershall's Concession C97-I, located in the N-97 field in the Western Hameimat Trough of the Sirte basin of north-central Libya. The reservoir comprises 5 distinct compartments, resulting in 4 uncertain fault transmissibilities. A large aquifer is located in the lowest compartment, but the depth of the oil-water contact also remains uncertain. Other uncertain reservoir parameters are the relative permeabilities of the oil and water phases and the oil viscosity.
We consider the situation where 5 producers and 3 injectors have already been drilled at the locations depicted in Figure 5. The field has been in production for 3500 days, and production data is available for all 5 producers. A decision needs to be made regarding the economic feasibility of drilling a 6th producer in the smallest reservoir compartment (denoted PNEW in Figure 5). Specifically, this decision will be based on the forecasted performance of the new well over the next 4000 days. We will therefore seek to estimate the P10-P50-P90 forecasts of PNEW based on the first 3500 days of production data from the existing 5 producers.
A prior set of models is required for CFCA to establish a statistical relationship between the data and the forecast. In this case, a set of 500 prior models is generated by Monte Carlo sampling of the uncertain parameters. The prior models were forward modeled using a streamline simulator (3DSL) over the full 7500 days, encompassing both the 3500 days of production data, denoted d, and the 4000 days of forecast h required to make the decision regarding the new well.
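The Monte Carlo sampling step can be sketched as follows. Note that the parameter names and ranges below are purely illustrative assumptions; the actual prior in `Prior.mat` was built separately:

```matlab
% Illustrative sketch of Monte Carlo sampling of the uncertain parameters.
% All ranges below are hypothetical, not the values used to build Prior.mat.
NumRealizations = 500;
rng(42); % for reproducibility

% 4 uncertain fault transmissibility multipliers, log-uniform in [1e-4, 1]
FaultTrans = 10.^(-4 + 4*rand(NumRealizations, 4));

% Uncertain oil-water contact depth (hypothetical range, in meters)
OWCDepth = 2550 + 50*rand(NumRealizations, 1);

% Corey exponents for oil/water relative permeability, and oil viscosity (cP)
CoreyExp = 2 + 2*rand(NumRealizations, 2);
OilVisc  = 1 + 4*rand(NumRealizations, 1);

% Each row parameterizes one reservoir model to be forward modeled in 3DSL
PriorParams = [FaultTrans, OWCDepth, CoreyExp, OilVisc];
```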
In [1]:
addpath('../src/evidential/');
addpath('../src/thirdparty/fda_matlab/');
We will next load the data. The historical and forecast data are stored as two structs with the following fields:
| Name | Data Type | Dimensions | Description |
|---|---|---|---|
| time | double | $1 \times N_{t}$ | Vector containing the time (in days or years) at each time step |
| data | double | $N_{real} \times N_{t} \times N_{responses}$ | Array of responses. Each row is one realization, each column a time step, and each slice a different well |
| name | string | 1 | The name of the response, e.g. 'Oil Rate' |
| spline | double | $1 \times 2$ | First element specifies the order of the spline, the second the number of knots |
| ObjNames | CellArray | $N_{responses} \times 1$ | Contains the name of each well, e.g. {'P1','P2'} |
| types | string | 1 | 'Historical' or 'Forecast' |
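As a concrete illustration, a struct with these fields could be assembled as follows. The sizes and well names here are placeholders; `Prior.mat` already contains the actual structs:

```matlab
% Hypothetical example of assembling a data struct with the fields above
Nt = 200; Nreal = 500; Nresp = 5;

PriorDataExample.time     = linspace(1, 3500, Nt);    % 1 x Nt, time in days
PriorDataExample.data     = zeros(Nreal, Nt, Nresp);  % realizations x time steps x wells
PriorDataExample.name     = 'Oil Rate';
PriorDataExample.spline   = [3 40];                   % 3rd-order spline, 40 knots
PriorDataExample.ObjNames = {'P1','P2','P3','P4','P5'};
PriorDataExample.types    = 'Historical';
```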
In [2]:
prior_path = '../data/evidential/Prior.mat';
load(prior_path);
For benchmarking, we will set one realization aside and use it as the truth. In this particular case, realization 12 was chosen as the reference. The production data from each of the 5 producers is shown below. The reference ($d_{obs}$) is shown in red, while the historical responses from the prior models are shown in gray ($d_{prior}$).
In [3]:
%plot inline -s 1600,1000
FontSize = 12;
% Set aside a realization to use as the "truth"
TruthRealization = 12;
NumPriorRealizations = size(PriorData.data,1); % number of realizations (rows of the data array)
AvailableRealizations = setdiff(1:NumPriorRealizations,TruthRealization);
PlotPriorResponses(PriorData,TruthRealization,FontSize);
%PlotPriorResponses(PriorPrediction,TruthRealization,FontSize);
Each response time series is technically infinite-dimensional, but has been discretized to 200 time steps for purposes of flow simulation. We require a dimension reduction before we can build a statistical relationship between the forecast and historical data. One such method for dimension reduction is Functional Data Analysis (ref). The basic idea is to express each response as a linear combination of basis functions: $$d(t) = \sum_{i=1}^K \alpha_i B_{i,n}(t)$$ where $B_{i,n}$ is the $i$-th B-spline basis function of order $n$. We use B-splines as basis functions in this example; a spline basis has the advantage of computationally cheap evaluation as well as readily available derivatives. The spline order and number of knots are modeling choices that need to be tuned for each case, usually by cross-validation. In this example, we will use 3rd-order splines (with 40 knots for the historical data and 20 knots for the forecasts) to perform FDA.
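The basis-expansion idea can be sketched with the Curve Fitting Toolbox (assumed available here; the notebook itself uses the bundled `fda_matlab` routines instead). A least-squares B-spline fit to one synthetic response curve recovers the $\alpha_i$ coefficients:

```matlab
% Sketch of the FDA idea: fit a least-squares B-spline to one response
% curve and recover the coefficients alpha_i of d(t) = sum_i alpha_i B_i(t).
% The response curve below is synthetic, for illustration only.
t = linspace(0, 3500, 200);                 % 200 time steps (days)
d = 1000*exp(-t/1500) + 50*randn(size(t));  % synthetic declining oil rate

order   = 3;                                % spline order n
npieces = 40;                               % number of knot intervals
sp      = spap2(npieces, order, t, d);      % least-squares spline fit
alpha   = fnbrk(sp, 'coefs');               % the alpha_i coefficients
d_fit   = fnval(sp, t);                     % reconstructed (smoothed) curve

fprintf('Reduced %d time steps to %d coefficients\n', numel(t), numel(alpha));
```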
Solving for the $\alpha_i$ coefficients effectively reduces the dimension of each response curve to $K$. However, we can achieve further compression by applying Principal Component Analysis on the coefficients. This is termed Functional Principal Component Analysis (FPCA). The effectiveness of the compression can be verified by inspecting the eigenvalues after PCA. In this case, we will look at the decomposition of Producer #1's historical oil production rate.
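The eigenvalue check after PCA can be sketched in base MATLAB. The coefficient matrix below is synthetic; in the notebook, `ComputeHarmonicScores` performs the actual FPCA:

```matlab
% Sketch: PCA on a (synthetic) matrix of spline coefficients, keeping
% enough components to explain 97% of the variance (cf. EigenTolerance).
Nreal = 500; K = 42;                    % realizations x spline coefficients
A = randn(Nreal, K) * diag(2.^-(1:K)); % synthetic coefficients with decaying scales

Ac = A - mean(A, 1);                   % center the coefficients before PCA
[~, S, ~] = svd(Ac, 'econ');
eigenvalues = diag(S).^2 / (Nreal - 1);

explained = cumsum(eigenvalues) / sum(eigenvalues);
nKeep = find(explained >= 0.97, 1);
fprintf('%d of %d components explain 97%% of the variance\n', nKeep, K);
```

Inspecting `nKeep` against the full dimension $K$ shows how much compression FPCA achieves beyond the spline fit alone.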
In [4]:
%plot inline -s 1200,800
MinEigenValues = 3; EigenTolerance = 0.97;
% We first perform FPCA on both d and h
PriorData.spline=[3 40]; % 3rd order spline with 40 knots
PriorDataFPCA = ComputeHarmonicScores(PriorData,4);
PriorPrediction.spline=[3 20]; % 3rd order spline with 20 knots
PriorPredictionFPCA = ComputeHarmonicScores(PriorPrediction,0);